Summary

Findings

Nachos would be an excellent product offering for Chipotle and should be sold in all regions.

  • Nachos can help grow sales and attract new customers to Chipotle. Historical data provides strong proof-of-concept, showing that nachos can add to overall sales as an additional menu offering. Accross all restaurants during the sample period, gross nacho sales were approximately $45k, slightly behind sales of quesadillas and slightly ahead of salads.

  • Few additional costs are required to add nachos to the menu. Nachos’ key ingredients are simple remixes of ingredients are already available at the service counter (lettuce, cheese, etc.). By adding such a popular menu item to its strong existing offerings, Chipotle has an opportunity to enhance its brand value among current customers and help bring in new customers through our doors.

  • A deep dive into optimal pricing can help set the best price to maximize revenue. The dataset provided contains a range of index prices and gross sales amounts. Subsequent investigations could use these data points as initial estimates for a target price level based on supplementary factors such as local sourcing costs. Note, however, that nachos’ similarity to Chipotle’s existing menu offerings may cause some customers simply to switch between products, substituting nachos for another item they would otherwise normally order. Further research could help shed light on this issue.


Overview

This notebook analyzes transaction-level data for Chipotle restaurants to help decide whether to roll out nachos as a menu offering across all Chipotle locations.

  • Basic data pre-processing steps were applied to the dataset, which reduced the overall number of records. The dataset started with roughly 912k records. Around 140k were removed to restrict the analysis to in-store, non-gift-card orders that were categorized as either dine-in, take-out, or waste. Two restaurants that did not offer nachos (locations 2403 and 6004) were also excluded from the analysis, as were a small number of outliers (e.g., order quantity greater than 1). The final number of records in the dataset was 561833.

  • Some inconsistencies were found in the data compared to descriptions provided in the assignment. The assignment states that Chipotle introduced Nachos to their menu in 2 markets on 10/1/2018. However, virtually all of the records provided contain nacho sales, and the dataset gives no indication of restaurant location or region. Additionally, individual transaction records start as early as September and continue into December, although the assignment states that the nacho campaign only ran until November. Ongoing dialogue and collaboration between the analytics research team and stakeholders around the business will help clarify these issues.


Analysis

This section explains the data processing steps.

  1. Loaded comma-separated file into memory (n=918412).

  2. Identified and removed the following types of records (n=142105)
  • Fulfillment type not in-store (i.e., 3rd party delivery or catering)
  • Order type neither dine-in, take-out, nor waste
  • Item name contains the words “gift card”
  1. Modified dataset to facilitate analysis
  • Disaggregated order date
  • Combined categorically similar items (e.g., kid’s, extras)
  1. Discovered that only two restaurants (2403 and 6004) had no nacho sales data at all
  • Removed these two because comparisons would be misleading (n=5963 and n=8893, respectively)
  • Future research could reintegrate these records pending further clarification

file_path <- "/Users/mj/Documents/Jobs/Chipotle/Assignment/nachos_sample_interviewees.csv"

rawdata <- read_csv(file_path) %>% arrange(RestaurantNumber, CheckDate)

filter1 <- 
  read_csv(file_path) %>% 
  filter(FulfillmentTypeName != 'In Store' |
           OrderModeName %in% c('DINE IN', 'TAKE OUT', 'WASTE') == FALSE |
           str_detect(ItemName, "Gift Card*") == TRUE)

df <- 
  read_csv(file_path) %>% 
  # sort
  arrange(
    RestaurantNumber, 
    CheckDate
    ) %>%
  # filter/subset
  filter(
    FulfillmentTypeName == 'In Store' &
    OrderModeName %in% c('DINE IN', 'TAKE OUT', 'WASTE') &
    str_detect(ItemName, "Gift Card") == FALSE
    ) %>% 
  # modify
  mutate(
    Month = month.abb[lubridate::month(CheckDate)],
    Week = dense_rank(lubridate::week(CheckDate)),
    WeekDay = weekdays(CheckDate),
    OrderModeName = as.factor(OrderModeName),
    Item = str_split(ItemName, '-', simplify = TRUE)[,1],
    Item = case_when(
      str_detect(Item, "Kid's*") == TRUE ~ "Kid's",
      str_detect(Item, "Extra*") == TRUE ~ "Extra",
      str_detect(Item, "Taco*") == TRUE ~ "Tacos",
      str_detect(Item, "Quesadilla*") == TRUE ~ "Quesadilla",
      str_detect(Item, "Side*") == TRUE ~ "Side",
      Item == "Boorito" ~ "Burrito", 
      TRUE ~ Item),
    SaleType = if_else(str_detect(ItemName, 'Nacho*'), "Nachos", "Other Item")
    ) %>% 
  # drop extra columns
  select(
    -SalesChannelName,
    -SalesChannelType,
    -DigitalChannelName,
    -FulfillmentTypeName,
    -ItemID,
    -ItemName
  )

# Show that we only have two restaurants that don't sell nachos
tbl <- df %>%
  group_by(RestaurantNumber) %>%
  mutate(
    StoreType = case_when(
      n_distinct(SaleType) == 1 ~ "Doesn't Sell Nachos",
      n_distinct(SaleType) == 2 ~ "Sells Nachos",
      TRUE ~ NA_character_)) %>%
  distinct(RestaurantNumber, StoreType) %>% 
  inner_join(
    df %>%
      group_by(RestaurantNumber, WeekDay) %>%
      summarise(
        GrossSales = sum(GrossSalesAmount),
        AverageSale = mean(GrossSalesAmount)
        ))
print("Here are total and average sales per day-of-week by restaurant.")
[1] "Here are total and average sales per day-of-week by restaurant."
tbl
print("These restaurants did not sell nachos during this period.")
[1] "These restaurants did not sell nachos during this period."
nonacho <- df %>% filter(RestaurantNumber %in% c(2403, 6004))
nonacho
print("These are dropped records: either they belong to restaurants that did not sell nachos, have quantity greater than 1, or $0 gross sales amount.")
[1] "These are dropped records: either they belong to restaurants that did not sell nachos, have quantity greater than 1, or $0 gross sales amount."
filter2 <- 
  df %>% # Note we read the filtered dataframe from above, not the raw data
  filter(Quantity > 1 | RestaurantNumber %in% c(2403, 6004) == TRUE | GrossSalesAmount == 0)
filter2
# Do final dataset modifications
df <- df %>% filter(
  Quantity <= 1 &
  RestaurantNumber %in% c(2403, 6004) == FALSE &
  GrossSalesAmount != 0
)

# Get record counts
n0 <- dim(rawdata)[1]
n1 <- dim(filter1)[1]
n2 <- dim(filter2)[1]
n3 <- dim(df)[1]
n4 <- n1
# Verify counts match
# n0 == n1 + n2 + n3

print("This is the dataset used for analysis.")
[1] "This is the dataset used for analysis."
df

Charts

This section visualizes item price, total sales and units sold by item during the nacho campaign.

  • Nachos generate positive sales volume, despite being the highest priced menu item. The figures below illustrate the high average price charged for nachos during the campaign–slightly more than salads, bowls, and burritos. Interestingly, despite its higher price, the total number of nachos sold is comparable to salads.

# Counts of items by type
df %>%
  select(Item, SaleType) %>% 
  # gather(key, value, -SaleType) %>%
  ggplot() +
  geom_bar(
    mapping = aes(
      col = SaleType,
      fct_relevel(fct_rev(fct_infreq(fct_lump(Item, n = 15))), "Other"))
    ) + 
  # facet_wrap(~key, scales = "free") + 
  xlab(label = "") +
  theme_classic() + 
  coord_flip() + 
  ggtitle("Number Sold")

# total gross sales by item
tot <- df %>% 
  group_by(Item, SaleType) %>% 
  summarise(GrossSales = sum(GrossSalesAmount, na.rm = TRUE)) %>% 
  arrange(desc(GrossSales))

# take top 15
tot[1:15,] %>%
  ggplot() + 
  geom_col(
    aes(
      x = reorder(Item, GrossSales), 
      y = GrossSales,  
      col = SaleType)) +
  scale_y_continuous(labels = dollar_format()) +
  xlab(label = "") +
  coord_flip() +
  theme_classic() + 
  ggtitle("Total Sales")

# use this snippet for labeling values
# +
#   geom_label(
#     aes(
#       label = GrossSales,
#       x = reorder(Item, GrossSales),
#       y = GrossSales),
#     size = 3)


# average price by item
avg <- df %>% 
  group_by(Item, SaleType) %>% 
  summarise(AveragePrice = mean(IndexPrice, na.rm = TRUE)) %>% 
  arrange(desc(AveragePrice))

# take top 15
avg[1:15,] %>%
  ggplot() + 
  geom_col(
    aes(
      x = reorder(Item, AveragePrice), 
      y = AveragePrice,  
      col = SaleType)) +
  scale_y_continuous(labels = dollar_format()) +
  xlab(label = "") +
  coord_flip() +
  theme_classic() + 
  ggtitle("Average Price")